1. INTRODUCTION

The World Health Organization (WHO) classifies hearing impairment into five categories: type 0 or
no impairment (mean hearing loss of 25 dB or less), type 1 or slight impairment (26-40 dB), type 2 or
moderate impairment (41-60 dB), type 3 or severe impairment (61-80 dB) and type 4 or profound
impairment (81 dB or more) [1]. Type 1 and type 2 are treatable by surgery, while type 2 and type 3
require hearing aids to enable hearing. However, neither surgery nor hearing aids can overcome type 4
hearing impairment. To enable communication for people with type 4 hearing loss, cochlear implants,
lip-reading and sign language can be used [1]. Statistics from [2] estimated that 16 million (12-26
million) children had hearing impairment other than type 0 (≥ 35 dB) in 2011, and the number is
expected to increase each year. The cost of hearing impairment treatment in low-income countries is
high [3]; therefore, sign language usually becomes the main means of communication for those with
hearing loss.

Children with hearing impairment usually have difficulty communicating with other people. They are
more likely to show negative behaviours, because they become frustrated and quick-tempered when their
message is not understood [4-7]. These behaviours usually occur when the children are trying to express
basic needs such as hunger, thirst or drowsiness to their parents. If they fail to get what they want,
they become agitated and cry. Parents normally send these children to formal sign language classes or
hire a professional babysitter to look after them. However, low-income families cannot afford the
babysitter's salary or the course fees. As a consequence, these parents choose neither option and their
children are neglected, losing opportunities for learning.

It is important to learn baby sign language because it is a powerful means of communication with
young children, not only for babies with profound hearing impairment but also for hearing babies. For
hearing infants, baby sign language helps those below two years old to express their needs [8-9] before
they can speak properly. In addition, research from [10] has stated that about 20% of babies with a
hearing problem have parents who can hear normally. Thus, there is a small probability that any parent
may have a child with hearing impairment. Owing to this importance, this research focuses on adults
learning baby sign language so that they can teach it to infants.

Nowadays, there are many media for interactive learning, such as the Internet and mobile apps, which
can help the public to learn baby sign language by themselves without attending a course. A sign
language translation system that runs on mobile devices and translates text into sign language animation
has been developed for Arabic sign language [11-12]. Sign language translation systems have also been
developed for learning English sign language [13] and Indian sign language [14]. A handheld mobile
system has been created for learning Arabic sign language, in which users refer to graphic applications
to learn the signs [15]. Another study developed sign language animation on mobile devices for learning
Chinese sign language, using a 3-D virtual human model to teach the signs [16]. However, these systems
are limited because each translation system is specific to the sign language of a particular country.

With the help of hand gestures learnt through baby sign language, children, especially those with
profound hearing impairment, can communicate with other people more naturally and expressively, because
their hand gestures can be detected and translated automatically. It takes great effort for children
with profound hearing impairment to form a word or sentence to communicate with their parents. Finger
detection for the alphabet of Spanish sign language, using contour analysis-based computer vision with a
glove worn on the hand, has been found to be suitable for most letters, reaching 99% accuracy for some
of them [17]. A boundary-trace based finger detection technique for the alphabet of American Sign
Language (ASL), which needs no special markers or gloves on the hand, worked accurately for 95% of the
letters tested [18]. However, these techniques can only translate alphabets.

Since learning hand gestures for baby sign language helps adults to understand infants, especially
those with special needs, a real-time dynamic hand gesture recognition system is developed. Dynamic hand
gesture recognition using contour analysis to recognise the Arabic language achieved an 85.67%
recognition rate when tested by different persons under the same light intensity [19]. A dynamic hand
gesture recognition system using an image processing module developed in OpenCV found that 9 of 11 hand
gestures worked properly, with an efficiency of 70%, against the same background colour [20]. A gesture
recognition system using a Support Vector Machine (SVM) was able to recognise sign language accurately,
but it required a white background [21]. A few real-time recognition systems have been developed, but
the suitable background colour has not yet been studied.

The major objectives of this research are to develop an interactive learning tool for baby sign
language using a Microsoft Foundation Class (MFC) application, and to investigate the accuracy of the
dynamic hand gesture recognition system among the three main races in Malaysia. This project is
primarily concerned with motion hand gesture detection. The scope of this study covers three basic baby
signs: “mom”, “eat” and “milk”. A built-in laptop web-camera was used to capture the image frames of the
signs. Two pre-setting experiments were carried out, testing different lighting conditions and different
background colours, to find the setup with the highest accuracy rate.

The rest of the paper is organised as follows. The proposed architecture for image processing is
discussed in detail in Section 2. In Section 3, the results of the proposed system are discussed.
Section 4 concludes the paper.

2. RESEARCH METHOD

The real-time dynamic hand gesture recognition system consists of four major parts: (1) a regular
web-camera, (2) Microsoft Visual Studio 2013, (3) OpenCV 2.4.11 and (4) an MFC application. Figure 1
shows the proposed system architecture. Firstly, an image frame is captured from the laptop web-camera.
Next, the image frame is passed to the image processing module, which consists of image filtering, image
pre-processing, and convex hull and convexity defects computation, before the gesture is classified
using Microsoft Visual Studio 2013 and OpenCV. Finally, the MFC library is used to generate a stable
release of the application, so that the program code does not have to be run repeatedly. In this
project, the MFC application is used to visualise the real-time dynamic hand gestures and translate them
into their meanings. The image processing module is discussed in detail in the following subsections.

Figure 1. The flowchart of the dynamic hand gesture recognition


2.1. Image Acquisition and Skin Colour Modelling

Image acquisition is the first stage of any vision system. In this stage, the image is retrieved
from a hardware-based source and then passed to the subsequent stages. The colour of the detected face
is used to build the skin colour model. A skin colour model built this way is more reliable because it
adapts to people with different skin colours and to different lighting conditions. The major factor that
influences image acquisition is the initial setup of the hardware used to capture the images. If the
initial hardware setup is of low quality, the captured image may not be salvageable. Improper
configuration and alignment of the software will then produce visual artifacts that complicate the
image processing [22].


2.2. Image Filtering and Pre-Processing 

Image processing is a method of performing operations on an image to obtain an enhanced image or to
extract useful information from it. The input is the image, whereas the output is the characteristics
or features associated with that image. Image processing basically consists of three steps: importing
the image via image acquisition tools, analysing and manipulating the image, and producing an altered
image or a report based on the analysis. The main process in this step is to detect only the hand
region by applying skin colour detection. Thus, the previously detected face region is removed so that
it does not confuse the system.
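
The face-based skin colour model of Section 2.1 and the face removal step above can be illustrated
with a minimal sketch. It is not the authors' implementation: the per-channel mean and standard
deviation model, the tolerance `k`, the pixel format and the function names are all assumptions made
for illustration only.

```python
from statistics import mean, pstdev


def build_skin_model(face_pixels, k=2.5):
    """Return per-channel (low, high) bounds from pixels sampled in the face box.

    Assumption: skin colour is modelled as mean +/- k standard deviations per
    channel; the paper does not state the model it actually uses.
    """
    bounds = []
    for channel in zip(*face_pixels):  # iterate over the three colour channels
        mu, sigma = mean(channel), pstdev(channel)
        bounds.append((mu - k * sigma, mu + k * sigma))
    return bounds


def is_skin(pixel, bounds):
    """True when every channel of the pixel lies inside the model bounds."""
    return all(lo <= value <= hi for value, (lo, hi) in zip(pixel, bounds))


def hand_mask(image, face_box, bounds):
    """Binary mask of skin-coloured pixels, with the face box forced to 0.

    image: 2-D list of 3-channel pixels; face_box: (x0, y0, x1, y1) with the
    right and bottom edges exclusive (hypothetical convention).
    """
    x0, y0, x1, y1 = face_box
    mask = []
    for y, row in enumerate(image):
        mask_row = []
        for x, pixel in enumerate(row):
            in_face = x0 <= x < x1 and y0 <= y < y1
            mask_row.append(1 if is_skin(pixel, bounds) and not in_face else 0)
        mask.append(mask_row)
    return mask
```

Removing the face box before further processing leaves the hand as the only large skin-coloured
region, which is what the later contour step relies on.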


2.3. Contours Convexity Hull and Convexity Defects Computation 


A contour is a sequence of points in which each point encodes the location of the next point on the
curve, which makes it possible to describe the shape in the processed image [23-24]. Figure 2 shows the
image of a hand with its convex hull and convexity defects. The red line represents the convex hull of
the hand, whereas the double-sided arrows represent the convexity defects [25]. The number of fingers is
used to select a specific actuator in the manipulator.





Figure 2. Image of hand with convex hull (red line) and convexity defects (double-sided arrow) [24]


The hand shape can be determined by finding the largest contour of the hand. Firstly, the palm
centre is detected to find the maximum area of the palm. A blue circle is drawn to show the palm region
that was found, whereas a white spot represents the detected palm centre. Then, the convex hull and the
convexity defects, which represent the gaps between fingers, are detected to determine the number of
fingers. The red spots mark the detected convex hull, whereas the light blue spots mark the detected
gaps between the fingers.
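
The convex hull and convexity defect step can be sketched in plain Python as follows. The paper uses
OpenCV for this; the code below is only an illustrative stand-in that uses Andrew's monotone chain
algorithm for the hull and a simple point-to-edge distance for the defect depth. The `min_depth`
threshold and the function names are assumptions, and the contour is assumed to be an ordered list of
distinct (x, y) points.

```python
from math import hypot


def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def convexity_defects(contour, min_depth):
    """Return (deepest_point, depth) for each gap between neighbouring hull vertices."""
    hull = set(convex_hull(contour))
    idx = [i for i, p in enumerate(contour) if p in hull]
    defects = []
    for n, i in enumerate(idx):
        j = idx[(n + 1) % len(idx)]
        a, b = contour[i], contour[j]
        # Contour points strictly between hull vertices a and b (wrapping around).
        span = (j - i) % len(contour)
        between = [contour[(i + s) % len(contour)] for s in range(1, span)]
        if not between:
            continue
        deepest = max(between, key=lambda p: abs(cross(a, b, p)))
        depth = abs(cross(a, b, deepest)) / hypot(b[0] - a[0], b[1] - a[1])
        if depth >= min_depth:  # shallow dents are treated as noise (assumed threshold)
            defects.append((deepest, depth))
    return defects
```

For a toy contour with three spikes, such as `[(0, 0), (8, 0), (8, 6), (6, 2), (4, 7), (2, 2), (0, 6)]`,
the two valleys at (6, 2) and (2, 2) are reported as defects, which corresponds to the gaps between
three raised fingers.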

The segmentation of hand motion can be done via motion analysis, which enables any movement in the
video scene to be detected. Hence, a uniform and static background should be set up to remove noise and
avoid confusing the system. In this project, the centre of the hand was detected in each frame, and the
tangent of the angle between the horizontal line and the line joining the hand centres in consecutive
frames was calculated. This enabled the system to identify the direction of the hand motion. Motion
analysis then extracted the features to obtain the temporal information and made a decision based on the
detected hand gestures.
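
The direction estimate described above can be sketched as follows. This is a minimal illustration, not
the authors' code: it uses `atan2` to recover the angle whose tangent is dy/dx (which also handles
purely vertical motion), and the four direction labels, the 45-degree sectors and the `min_shift`
threshold in pixels are assumptions.

```python
from math import atan2, degrees, hypot


def motion_direction(prev_centre, curr_centre, min_shift=2.0):
    """Classify hand motion between two consecutive frames from the hand centres."""
    dx = curr_centre[0] - prev_centre[0]
    dy = prev_centre[1] - curr_centre[1]  # image y grows downward; flip so "up" is positive
    if hypot(dx, dy) < min_shift:
        return "still"  # ignore jitter below the assumed threshold
    angle = degrees(atan2(dy, dx))  # angle to the horizontal, in (-180, 180]
    if -45 <= angle < 45:
        return "right"
    if 45 <= angle < 135:
        return "up"
    if -135 <= angle < -45:
        return "down"
    return "left"
```

Calling `motion_direction((100, 100), (120, 100))` classifies the motion as rightward, while a hand
centre that moves by less than `min_shift` pixels is treated as stationary.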

A mask function is commonly used to copy the desired information about the hand gesture: the
location of the convex hull, the convexity defects, the palm centre, the number of fingertips and the
direction of hand movement. Only the region set as the desired region is copied; the other regions are
not. The recorded information is set as the reference for the hand gesture and compared with other hand
gestures during the testing phase. Figure 3 shows the detected face and the location of the convex hull
for the signs “mom”, “eat” and “milk”. The system recognises the hand gesture as “mom” when it detects
five fingertips and four convexity defects and the hand moves closer to the face region. For “eat”, the
system must detect more than three fingertips and fewer than three convexity defects, and the distance
between hand and face must be less than 5 cm. For “milk”, the system must detect zero fingertips.
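
The three recognition rules stated above can be written as a small decision function. This is a
sketch under stated assumptions: the function name, the argument names, the order in which the rules
are checked and the use of the previous frame's distance to express "moves closer to the face" are
illustrative choices, not details given in the paper.

```python
def classify_sign(fingertips, defects, face_distance_cm, prev_face_distance_cm):
    """Map per-frame hand features to "mom", "eat", "milk" or None.

    face_distance_cm is the hand-to-face distance in the current frame and
    prev_face_distance_cm the same distance in the previous frame (assumed inputs).
    """
    # "milk": zero fingertips detected (closed fist).
    if fingertips == 0:
        return "milk"
    # "mom": five fingertips, four convexity defects, hand approaching the face.
    if fingertips == 5 and defects == 4 and face_distance_cm < prev_face_distance_cm:
        return "mom"
    # "eat": more than three fingertips, fewer than three defects, hand within 5 cm of the face.
    if fingertips > 3 and defects < 3 and face_distance_cm < 5:
        return "eat"
    return None  # no rule matched
```

Because “mom” requires exactly four defects and “eat” requires fewer than three, the two rules cannot
both match the same frame, so their relative order does not change the result.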







Figure 3. Recognition of hand gesture (a) mom, (b) eat and (c) milk 


3. RESULTS AND ANALYSIS 

A total of 30 datasets from different races in Malaysia were collected: 10 sets for Malay, 10 sets
for Chinese and 10 sets for Indian participants. These three main races in Malaysia were selected to
represent three different skin colours. Fifteen datasets were used for training and the remaining 15
were used for testing. Figure 4 shows the three baby signs used in this project, which are “mom”, “eat”
and “milk”.


Indonesian J Elec Eng & Comp Sci, Vol. 18, No. 1, April 2020 : 361 - 367


The distance between the laptop and the user was set at 0.3 m. Three experiments were conducted to
test the system accuracy: the effect of different brightness, the effect of different background
colours, and the overall performance of the developed system on different skin colours. The accuracy is
calculated using the following formula:


$$\text{Accuracy} = \frac{\text{Number of test samples that are correctly classified}}{\text{Total number of test samples}}$$
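
The accuracy ratio can be computed as in the short sketch below. The function name and the label
lists are hypothetical; the result is multiplied by 100 on the assumption that the accuracies in this
section are reported as percentages.

```python
def accuracy(predicted, expected):
    """Percentage of test samples whose predicted label equals the expected label."""
    if not expected or len(predicted) != len(expected):
        raise ValueError("predicted and expected must be non-empty and of equal length")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return 100.0 * correct / len(expected)
```

For example, two correct classifications out of three test samples give an accuracy of about 66.67%.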







Figure 4. Three selected signs of baby sign language which are: (a) mom, (b) eat and (c) milk [26] 


3.1. The Effect of Different Brightness 

In the brightness test, the accuracy of hand gesture recognition was tested under common room
lighting and compared with a setup in which an external lamp was placed in front of the laptop. Table 1
shows the recognition accuracy in the brightness test. The system recognises the hand gestures more
consistently with the external lamp, obtaining 93.33% average accuracy, than under common room lighting,
where the average accuracy is only 88.89%. This is because the hand gesture is captured more clearly by
the system when an external lamp is placed in front of the user.


Table 1. Hand Gesture Recognition Accuracy in Brightness Test 


Brightness condition                          Hand gesture recognition accuracy (%)    Average
                                              Mom       Eat       Milk                 accuracy (%)
Common lighting room condition                100       100       66.67                88.89
External lamp added in front of the laptop    96.67     83.33     100                  93.33


3.2. The Effect of Different Background Colour 

In the background colour test, five background colours (blue, yellow, grey, green and white) were
tested to determine the best background for the system. Table 2 shows that the system performs best
against a blue background, with 94.44% accuracy, and worst against a yellow background, with 0%
accuracy. This large difference arises because yellow is close to human skin tones, which confuses the
system when it detects the face and hand regions. Green and white backgrounds give the same accuracy of
92.22%, whereas the grey background gives 90%. The results show that the system can recognise the hand
gestures against all the tested background colours except yellow, and that the best setup is a blue
background.


Table 2. Hand Gesture Recognition Accuracy in Background Colour Test 


Background    Hand gesture recognition accuracy (%)    Average
colour        Mom       Eat       Milk                 accuracy (%)
Blue          100       83.33     100                  94.44
Yellow        0         0         0                    0
Grey          100       100       70                   90
Green         93.33     83.33     100                  92.22
White         93.33     83.33     100                  92.22


3.3. The Overall Performance Comparison on Different Skin Colours

To evaluate the performance of the system on different skin colours, the background was fixed to
blue and an external lamp was placed in front of the laptop. Table 3 summarises the effectiveness of the
system on different skin colours. The results show that the system can recognise the hand gestures for
different skin colours. Average accuracies of 96.33%, 85.78% and 91.78% are obtained for the hand
gestures “mom”, “eat” and “milk” respectively. The lowest accuracy is produced by the sign “eat”,
followed by the sign “milk”. This is due to the difficulty of finding the contour and convexity defects
of these signs compared with the “mom” sign. Furthermore, Malay, Chinese and Indian participants show
recognition accuracies of 91.33%, 93.11% and 89.44% respectively. Hence, this system is suitable for all
races with different skin colours, owing to the implemented skin colour modelling.


Table 3. Accuracy of Hand Gesture Recognition among Different Races 


Races      Hand gesture recognition accuracy (%)    Average
           Mom       Eat       Milk                 accuracy (%)
Malays     96.33     86.67     91                   91.33
Chinese    96        90        93.33                93.11
Indian     96.67     80.67     91                   89.44
Average    96.33     85.78     91.78
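
The averages in Table 3 can be rechecked from the per-sign entries with a few lines of Python. The
dictionary layout and variable names are illustrative only; the numbers are the table entries as
printed.

```python
# Per-sign accuracies (%) from Table 3, in the order (mom, eat, milk).
table3 = {
    "Malays": (96.33, 86.67, 91.0),
    "Chinese": (96.0, 90.0, 93.33),
    "Indian": (96.67, 80.67, 91.0),
}

# Average over the three signs for each race (last column of the table).
row_avg = {race: round(sum(v) / len(v), 2) for race, v in table3.items()}

# Average over the three races for each sign (last row of the table).
col_avg = [round(sum(col) / len(col), 2) for col in zip(*table3.values())]
```

Recomputing from the rounded entries reproduces the published averages to within 0.01; the Indian row
comes out at about 89.45 rather than 89.44, which is consistent with the published value having been
computed from unrounded accuracies.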


4. CONCLUSION 

The aim of the present research was to develop a real-time hand gesture recognition system for baby
sign language. Two preliminary system setting investigations were carried out to discover the effect of
different light intensities and different background colours. In the former, the system performed best
when an external lamp was added in front of the webcam. In the latter, the blue background gave a stable
average accuracy rate. These preliminary experiments also indirectly show that a moving background
should be avoided to reduce misclassified hand gestures. However, this study was limited in its
participants, since it did not include datasets from children with hearing impairment or from babies
below 2 years old to evaluate the performance of the system with children. Notwithstanding this
limitation, the system is still applicable as a tool for adults to learn baby sign language. A further
study is suggested to test the system with children of different ages with hearing impairment and with
hearing infants below 2 years old. In addition, the system can be upgraded into a mobile phone app to
make it more portable than the current system and to ease image transfer, since images obtained by a
high-end smartphone camera are clearer and more accurate than those from the current one.